Add an option to return only chunks to LLMs #168
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff           @@
##             main     #168   +/-   ##
=======================================
  Coverage   99.28%   99.28%
=======================================
  Files          22       22
  Lines        1534     1534
=======================================
  Hits         1523     1523
  Misses         11       11
```
Hi, thanks for this PR! I haven't done this because of #146. I'll get back to this once I fix that.
While we're at it, do you think it makes sense to give the option to the LLM? We could make this a boolean parameter that the LLM can set, so that it can decide whether it wants chunks or full documents. The downside is that the LLM doesn't know the project before the function call, so its decisions might be bad and unreliable.
Yeah, I don't think it makes sense to expose that directly to an LLM -- whether to return the chunk or the document really depends on the size of the full document responses. It's not something I would expect the LLM to have enough context to make an intelligent decision on.
We might be able to utilise codecompanion's token count feature for this. The problem is how we can determine the max context window so that the token count actually makes sense (for example, a 100k-token conversation is close to being saturated for a 128k LLM, but is still very usable for a 1M LLM). I also have this thought about using an auxiliary LLM as a response-rewriter that paraphrases the document content into concise, descriptive paragraphs as the response. Paraphrasing, compared to coding, is a simpler task and can hopefully be handled well enough by a small, cheap (or free) LLM. If it works, it could save the token count for the main LLM, allowing users to have longer conversations with the main coder LLM, and hopefully reduce the cost as well.
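To make the saturation idea concrete, here is a minimal Lua sketch of what such a heuristic could look like; the function name, the inputs, and the 0.75 threshold are all illustrative assumptions, not anything proposed in this PR:

```lua
-- Hypothetical heuristic: prefer chunk mode once the conversation
-- consumes a large fraction of the model's context window.
-- The 0.75 cutoff is arbitrary, chosen only for illustration.
local function should_use_chunks(token_count, context_window)
  return (token_count / context_window) > 0.75
end

print(should_use_chunks(100000, 128000))  -- true:  ~78% of a 128k window
print(should_use_chunks(100000, 1000000)) -- false: only 10% of a 1M window
```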
53a5390 force-pushed to 293dac5
293dac5 force-pushed to 8abdfc2
Davidyz left a comment:
Apart from the comments in the changes, I think we should also make `max_num` and `default_num` tables (something like `{document=10, chunk=100}`), so that users can set different `default_num` and `max_num` for document and chunk mode. The chunk length and document length can differ by a lot, so it makes sense to set different numbers for them.
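For illustration, the proposed shape could look something like the sketch below; the `tool_opts` nesting and the concrete numbers are assumptions, not code from this PR:

```lua
-- Sketch of the suggested per-mode limits (values are illustrative).
tool_opts = {
  default_num = { document = 10, chunk = 100 },
  max_num = { document = 20, chunk = 200 },
}
```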
Also, to make sure it's backward-compatible, we should add a check that converts the old format of the config to the new format, and raise a warning via `vim.deprecate` (if you're not familiar with the syntax, this is how I do it). I'll remove the backward-compatibility stuff before making the 0.7.0 release.
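A minimal sketch of such a compatibility shim, assuming the old format was a plain number; the helper name, the `opts` handling, and the plugin name passed to the call are hypothetical, while `vim.deprecate` is the real Neovim API and 0.7.0 is the release mentioned above:

```lua
-- Hypothetical shim: convert the old scalar format to the new table format,
-- warning users once via vim.deprecate(name, alternative, version, plugin).
local function normalize_num_opt(opts, key)
  if type(opts[key]) == "number" then
    vim.deprecate(
      ("tool_opts.%s = <number>"):format(key),
      ("tool_opts.%s = { document = ..., chunk = ... }"):format(key),
      "0.7.0",
      "VectorCode"
    )
    opts[key] = { document = opts[key], chunk = opts[key] }
  end
end

normalize_num_opt(opts, "default_num")
normalize_num_opt(opts, "max_num")
```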
```lua
local args = { "query" }
vim.list_extend(args, action.options.query)
vim.list_extend(args, { "--pipe", "-n", tostring(action.options.count) })
vim.list_extend(args, { "--include", "path", "chunk", "document" })
```
The `chunk` and `document` options in the `--include` flag are exclusive. You can have only one of them. `path` can stay there.
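A possible shape for the fix, borrowing the `chunk_mode` name suggested below; this is a sketch, not the PR's final code:

```lua
-- Include either "chunk" or "document" in --include, never both; "path" stays.
local include_type = action.options.chunk_mode and "chunk" or "document"
vim.list_extend(args, { "--include", "path", include_type })
```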
```lua
auto_submit = { ls = false, query = false },
ls_on_start = false,
no_duplicate = true,
only_chunks = false,
```
Since the `chunk` and `document` options can never co-exist, maybe rename this to `chunk_mode`?
8abdfc2 force-pushed to 8aa380b
I'll check out from this branch to refactor some of the code to prepare the codebase for #179. All your commits will stay. I'll also make the changes that I mentioned here.
When files are very large, returning the entire file can quickly blow out the context, even with a small number of results returned. This PR adds an option, `only_chunks` (specified in `tool_opts`), to return only chunks to the CodeCompanion LLM. This allows the LLM to decide whether to request the entire file, based on the chunk and on whether it actually seems relevant to the question. I've only added this to the CodeCompanion backend for now (just because I don't have CopilotChat configured to test it).

Related: "StringChunker doesn't handle multi-line text" #146
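For reference, enabling the option from a user config might look like the following sketch; only `tool_opts` and `only_chunks` come from this PR, and the surrounding structure (including the other option names, quoted from the diff above) is illustrative:

```lua
-- Hypothetical user config enabling chunk-only responses for CodeCompanion.
tool_opts = {
  only_chunks = true, -- return chunks instead of full documents
  -- other options from the diff above keep their defaults:
  -- auto_submit = { ls = false, query = false },
  -- ls_on_start = false,
  -- no_duplicate = true,
}
```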